{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Sequence Types" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" }, "tags": [ "remove-cell" ] }, "source": [ "**CS1302 Introduction to Computer Programming**\n", "___" ] }, { "cell_type": "code", "execution_count": 65, "metadata": { "ExecuteTime": { "end_time": "2020-11-27T11:20:04.656873Z", "start_time": "2020-11-27T11:20:04.651575Z" }, "slideshow": { "slide_type": "fragment" }, "tags": [ "remove-cell" ] }, "outputs": [], "source": [ "import random\n", "\n", "%reload_ext mytutor" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Motivation of composite data type" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The following code calculates the average of five numbers:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "ExecuteTime": { "end_time": "2021-03-20T14:52:00.626044Z", "start_time": "2021-03-20T14:52:00.608190Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "3.0" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def average_five_numbers(n1, n2, n3, n4, n5):\n", " return (n1 + n2 + n3 + n4 + n5) / 5\n", "\n", "\n", "average_five_numbers(1, 2, 3, 4, 5)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "What about using the above function to compute the average household income in Hong Kong? \n", "The labor force in Hong Kong is close to [4 million](https://www.gov.hk/en/about/abouthk/factsheets/docs/employment.pdf).\n", "- Should we create a variable to store the income of each individual?\n", "- Should we recursively apply the function to groups of five numbers?" 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "What we need is\n", "- a *composite data type* that can keep a variable number of items, so that \n", "- we can then define a function that takes an object of the *composite data type*,\n", "- and returns the average of all items in the object." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**How to store a sequence of items in Python?**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We have already learned a composite data type that stores a sequence of characters. What is it?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "`tuple` and `list` are two other built-in sequence types for ordered collections of objects. Unlike a string, they can store items of possibly different types." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Indeed, we have already used tuples and lists before." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": { "ExecuteTime": { "end_time": "2020-11-02T23:25:35.106582Z", "start_time": "2020-11-02T23:25:35.101478Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%mytutor -h 300\n", "a_list = \"1 2 3\".split()\n", "a_tuple = (lambda *args: args)(1, 2, 3)\n", "a_list[0] = 0\n", "a_tuple[0] = 0 # raises TypeError: tuples are immutable" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**What is the difference between tuple and list?**" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- List is [*mutable*](https://docs.python.org/3/library/stdtypes.html#index-21) so programmers can change its items.\n", "- Tuple is [*immutable*](https://docs.python.org/3/glossary.html#term-immutable) like `int`, `float`, and `str`, so\n", " - programmers can be certain the content stays unchanged, and\n", " - Python can preallocate a fixed amount of memory to store its content." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Constructing sequences" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**How to create tuple/list?**" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Mathematicians often represent a set of items in two different ways:\n", "1. [Roster notation](https://en.wikipedia.org/wiki/Set_(mathematics)#Roster_notation), which enumerates the elements in the sequence, e.g.,\n", "\n", "$$ \\{0, 1, 4, 9, 16, 25, 36, 49, 64, 81\\} $$\n", "\n", "2. 
[Set-builder notation](https://en.wikipedia.org/wiki/Set-builder_notation), which describes the content using a rule for constructing the elements, e.g.,\n", "\n", "$$ \\{x^2| x\\in \\mathbb{N}, x< 10 \\}, $$\n", "\n", "namely the set of perfect squares less than 100." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Python also provides two corresponding ways to create a tuple/list: \n", "1. [Enclosure](https://docs.python.org/3/reference/expressions.html?highlight=literals#grammar-token-enclosure)\n", "2. [Comprehension](https://docs.python.org/3/reference/expressions.html#index-12)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**How to create a tuple/list by enumerating its items?**" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "To create a tuple, we enclose a comma-separated sequence in parentheses:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "ExecuteTime": { "end_time": "2020-11-02T23:27:26.558639Z", "start_time": "2020-11-02T23:27:26.554769Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%mytutor -h 450\n", "empty_tuple = ()\n", "singleton_tuple = (0,) # why not (0)?\n", "heterogeneous_tuple = (singleton_tuple, (1, 2.0), print)\n", "enclosed_starred_tuple = (*range(2), *\"23\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Note that:\n", "- If the enclosed sequence has one term, there must be a comma after the term.\n", "- The elements of a tuple can have different types.\n", "- The unpacking operator `*` can unpack an iterable into a sequence in an enclosure." 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "To create a list, we use square brackets to enclose a comma-separated sequence of objects." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "ExecuteTime": { "end_time": "2020-11-02T23:29:55.099284Z", "start_time": "2020-11-02T23:29:55.092488Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%mytutor -h 450\n", "empty_list = []\n", "singleton_list = [0] # no need to write [0,]\n", "heterogeneous_list = [singleton_list, (1, 2.0), print]\n", "enclosed_starred_list = [*range(2), *\"23\"]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "We can also create a tuple/list from other iterables using the constructors `tuple`/`list` as well as addition and multiplication similar to `str`." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "ExecuteTime": { "end_time": "2020-11-02T23:31:26.431382Z", "start_time": "2020-11-02T23:31:26.426487Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%mytutor -h 950\n", "str2list = list(\"Hello\")\n", "str2tuple = tuple(\"Hello\")\n", "range2list = list(range(5))\n", "range2tuple = tuple(range(5))\n", "tuple2list = list((1, 2, 3))\n", "list2tuple = tuple([1, 2, 3])\n", "concatenated_tuple = (1,) + (2, 3)\n", "concatenated_list = [1, 2] + [3]\n", "duplicated_tuple = (1,) * 2\n", "duplicated_list = 2 * [1]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**Exercise** Explain the difference between the following two expressions. Why must a singleton tuple have a comma after the item?" 
] }, { "cell_type": "code", "execution_count": 9, "metadata": { "ExecuteTime": { "end_time": "2020-11-02T23:31:48.052688Z", "start_time": "2020-11-02T23:31:48.048349Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "6\n", "(3, 3)\n" ] } ], "source": [ "print((1 + 2) * 2, (1 + 2,) * 2, sep=\"\\n\")" ] }, { "cell_type": "markdown", "metadata": { "nbgrader": { "grade": true, "grade_id": "singleton-tuple", "locked": false, "points": 0, "schema_version": 3, "solution": true, "task": false }, "slideshow": { "slide_type": "-" } }, "source": [ "`(1+2)*2` evaluates to `6` but `(1+2,)*2` evaluates to `(3,3)`. \n", "- The parentheses in `(1+2)` indicate the addition needs to be performed first, but \n", "- the parentheses in `(1+2,)` create a tuple. \n", "\n", "Hence, a singleton tuple must have a comma after the item to differentiate these two use cases." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**How to use a rule to construct a tuple/list?**" ] }, { "cell_type": "markdown", "metadata": { "ExecuteTime": { "end_time": "2020-10-29T00:11:10.722819Z", "start_time": "2020-10-29T00:11:10.718451Z" }, "slideshow": { "slide_type": "fragment" } }, "source": [ "We can specify the rule using a [comprehension](https://docs.python.org/3/reference/expressions.html#index-12), \n", "which we have used in a [generator expression](https://docs.python.org/3/reference/expressions.html#index-22). \n", "E.g., the following is a Python one-liner that returns a generator for prime numbers." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": { "ExecuteTime": { "end_time": "2020-11-02T23:36:56.247594Z", "start_time": "2020-11-02T23:36:56.233173Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97\n" ] }, { "data": { "text/plain": [ "\u001b[0;31mSignature:\u001b[0m \u001b[0mall\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0miterable\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m/\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mDocstring:\u001b[0m\n", "Return True if bool(x) is True for all values x in the iterable.\n", "\n", "If the iterable is empty, return True.\n", "\u001b[0;31mType:\u001b[0m builtin_function_or_method\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "all?\n", "prime_sequence = lambda stop: (\n", " x for x in range(2, stop) if all(x % divisor for divisor in range(2, x))\n", ")\n", "print(*prime_sequence(100))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "There are two comprehensions used:\n", "- In `all(x % divisor for divisor in range(2, x))`, the comprehension creates a generator of remainders to the function `all`, which returns `True` if all the remainders are non-zero else `False`.\n", "- In the return value `(x for x in range(2, stop) if ...)` of the anonymous function, the comprehension creates a generator of numbers from 2 to `stop-1` that satisfy the condition of the `if` clause. " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**Exercise** Use comprehension to define a function `composite_sequence` that takes a non-negative integer `stop` and returns a generator of composite numbers strictly smaller than `stop`. Use `any` instead of `all` to check if a number is composite." 
] }, { "cell_type": "code", "execution_count": 11, "metadata": { "ExecuteTime": { "end_time": "2020-11-02T23:36:33.954168Z", "start_time": "2020-11-02T23:36:33.932818Z" }, "nbgrader": { "grade": false, "grade_id": "composite_sequence", "locked": false, "schema_version": 3, "solution": true, "task": false }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "4 6 8 9 10 12 14 15 16 18 20 21 22 24 25 26 27 28 30 32 33 34 35 36 38 39 40 42 44 45 46 48 49 50 51 52 54 55 56 57 58 60 62 63 64 65 66 68 69 70 72 74 75 76 77 78 80 81 82 84 85 86 87 88 90 91 92 93 94 95 96 98 99\n" ] } ], "source": [ "any?\n", "### BEGIN SOLUTION\n", "composite_sequence = lambda stop: (\n", " x for x in range(2, stop) if any(x % divisor == 0 for divisor in range(2, x))\n", ")\n", "### END SOLUTION\n", "\n", "print(*composite_sequence(100))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "We can construct a list instead of a generator using [list comprehension](https://docs.python.org/3/glossary.html#term-list-comprehension):" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[x ** 2 for x in range(10)] # Enclose comprehension by brackets" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "**Is the list comprehension the same as applying `list` to a generator expression?**" ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]" ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(x ** 2 for x in range(10)) # Pass a generator expression to list" ] },
{ "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The two give the same result, but list comprehension is more efficient as it does not need to create a generator first:" ] }, { "cell_type": "code", "execution_count": 51, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1.99 µs ± 35.8 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)\n" ] } ], "source": [ "%%timeit\n", "[x ** 2 for x in range(10)]" ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2.55 µs ± 317 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n" ] } ], "source": [ "%%timeit\n", "list(x ** 2 for x in range(10))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**Exercise** The following are two different ways to use comprehension to construct a tuple. Which one is faster? Try predicting the results before running them." ] }, { "cell_type": "code", "execution_count": 52, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "4.41 µs ± 772 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n" ] } ], "source": [ "%%timeit\n", "tuple(x for x in range(100))" ] }, { "cell_type": "code", "execution_count": 53, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2.57 µs ± 156 ns per loop (mean ± std. dev. 
of 7 runs, 100000 loops each)\n" ] } ], "source": [ "%%timeit\n", "tuple([x for x in range(100)])" ] }, { "cell_type": "markdown", "metadata": { "nbgrader": { "grade": true, "grade_id": "generator-vs-tuple", "locked": false, "points": 0, "schema_version": 3, "solution": true, "task": false }, "slideshow": { "slide_type": "-" } }, "source": [ "The second method is often faster because the items can be collected faster with a list comprehension than with a generator expression. This benefit appears to outweigh the cost of converting the list to a tuple." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "With list comprehension, we can simulate a sequence of biased coin flips." ] }, { "cell_type": "code", "execution_count": 138, "metadata": { "ExecuteTime": { "end_time": "2020-11-02T23:42:30.880408Z", "start_time": "2020-11-02T23:42:30.832881Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Chance of head: 0.2976931267462506\n", "Coin flips: T T T H T T T T T H T T H H T H H T H H T T T T T T T T T T T T T H T H T T H T T T T T T T T T H T T T T T T H T T H T H H T H T T T T H H T T T T T T T T T T H T H T H T T T T T H H H T H T H T T T T T T T T T T T T T T T T T H T H H T T T H T T H T H T T T H T T T T T H T H T T T T T T T T T T T T H T H T T T T T T T H T T T T H H T T H H T T H H H T T T T H T H T T T T T T H T T T T T T T T T T T T T T H T H H H T T H T T H T T T T H H T T T T T H T T H H T T H T H T H H T H T T H T T H T T T T H H T T T H T T T H T T T T T T T T H H T T T H T T T T H T T H H T T T T T T T H T H H T T H H H T H T T T T T T H T T T T T T H T H T T H T H T T T H T T T T H T T H T H T T T H T T T H H T T H T H H T T T T T T H H T H T T T H T T H H T T H T H T H T T T H T T T H H H T T T T T T T H T T H T T T H T T T T H T T T H T T T H T T T H T H T H T T H H T T T T T T H H T H H T T T T T H T T H T H T T H T T H T T H H T H H 
T H H H T H T T T T T T T H T T T H T H T H H H T T T T H H T T H T H T T T H H T H T H T T T T T H T H H T T T T H H T H H T T H H T T T T T H T T H T H T T H H T H T T H T H T T T T T H T T T H T T T T T T T H T H T T H H T H H T T T T T H T T T H H T T H T T H T T T T T H T T H H T T T H T T H H T T T T H H T T T T T T T H T H T H H T H H T H T T T H T T H H H T T T T T T H T T H T T T T T H T H T T T H T T T T T T H T T H H T T T H T T H T T H T T T T T T T T T H H T T T H T T T T T H H T H T H T H T T H H T T T T T T T T T T T H T T T T T T T T T H T T T H H H T T T H H T T T T H T T T T H T T H T T T T T H T T T T T T H T T H T T T T T H T H T H T T H T T T T T H T T T T T H T T T H T H H T H T H T T T T T H H T T T T T H T T T H H T H T H T H T T H T T T T T H T T T T H T H H T T H T T T T H H T T T T H T H H H H T T H T H T T T T T H T T T T H T T T T T T T T H H T T T T H T H H H T H T T H H T T T H H H T T H H T T T T T T T T T H T T H H T H T T T T T T T T H T H T T H T T H T H T T T T T T T T T H T H T T T H T H T H T H T T T T T T H T H T T T T\n" ] } ], "source": [ "from random import random as rand\n", "\n", "p = rand() # unknown bias\n", "coin_flips = [\"H\" if rand() <= p else \"T\" for i in range(1000)]\n", "print(\"Chance of head:\", p)\n", "print(\"Coin flips:\", *coin_flips)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We can then estimate the bias by the fraction of heads coming up." 
] }, { "cell_type": "code", "execution_count": 139, "metadata": { "ExecuteTime": { "end_time": "2020-11-02T23:43:05.198459Z", "start_time": "2020-11-02T23:43:05.193224Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Fraction of heads: 0.304\n" ] } ], "source": [ "def average(seq):\n", " return sum(seq) / len(seq)\n", "\n", "\n", "head_indicators = [1 if outcome == \"H\" else 0 for outcome in coin_flips]\n", "fraction_of_heads = average(head_indicators)\n", "print(\"Fraction of heads:\", fraction_of_heads)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Note that `sum` and `len` return the sum and the length of the sequence, respectively." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "**Exercise** Define a function `variance` that takes in a sequence `seq` and returns the [variance](https://en.wikipedia.org/wiki/Variance) of the sequence." 
] }, { "cell_type": "code", "execution_count": 140, "metadata": { "ExecuteTime": { "end_time": "2020-11-02T23:43:51.901668Z", "start_time": "2020-11-02T23:43:51.897232Z" }, "nbgrader": { "grade": false, "grade_id": "variance", "locked": false, "schema_version": 3, "solution": true, "task": false }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "95% confidence interval: [0.27,0.33]\n" ] } ], "source": [ "def variance(seq):\n", " ### BEGIN SOLUTION\n", " return sum(i ** 2 for i in seq) / len(seq) - average(seq) ** 2\n", " ### END SOLUTION\n", "\n", "\n", "delta = (variance(head_indicators) / len(head_indicators)) ** 0.5\n", "print(\"95% confidence interval: [{:.2f},{:.2f}]\".format(p - 2 * delta, p + 2 * delta))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Selecting items in a sequence" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**How to traverse a tuple/list?**" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Instead of calling the dunder method directly, we can use a for loop to iterate over all the items in order." ] }, { "cell_type": "code", "execution_count": 54, "metadata": { "ExecuteTime": { "end_time": "2020-11-02T23:45:55.687173Z", "start_time": "2020-11-02T23:45:55.681215Z" }, "scrolled": true, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0 1 2 3 4 " ] } ], "source": [ "a = (*range(5),)\n", "for item in a:\n", " print(item, end=\" \")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "To do it in reverse, we can use the `reversed` function." 
] }, { "cell_type": "code", "execution_count": 55, "metadata": { "ExecuteTime": { "end_time": "2020-11-02T23:45:16.488066Z", "start_time": "2020-11-02T23:45:16.477429Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "4 3 2 1 0 " ] } ], "source": [ "reversed?\n", "a = [*range(5)]\n", "for item in reversed(a):\n", " print(item, end=\" \")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We can also traverse multiple tuples/lists simultaneously by `zip`ping them." ] }, { "cell_type": "code", "execution_count": 56, "metadata": { "ExecuteTime": { "end_time": "2020-11-02T23:46:12.766014Z", "start_time": "2020-11-02T23:46:12.751946Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0 4\n", "1 3\n", "2 2\n", "3 1\n", "4 0\n" ] } ], "source": [ "zip?\n", "a = (*range(5),)\n", "b = reversed(a)\n", "for item1, item2 in zip(a, b):\n", " print(item1, item2)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**How to select an item in a sequence?**" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Sequence objects such as `str`/`tuple`/`list` implement the [*getter method* `__getitem__`](https://docs.python.org/3/reference/datamodel.html#object.__getitem__) to return their items." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We can select an item of a sequence `a` by [subscription](https://docs.python.org/3/reference/expressions.html#subscriptions) \n", "```Python\n", "a[i]\n", "``` \n", "where `a` is a sequence and `i` is an integer index." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "A non-negative index indicates the distance from the beginning." 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "$$\\boldsymbol{a} = (a_0, ... , a_{n-1})$$" ] }, { "cell_type": "code", "execution_count": 72, "metadata": { "ExecuteTime": { "end_time": "2020-11-02T23:47:38.089722Z", "start_time": "2020-11-02T23:47:38.080226Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)\n", "Length: 10\n", "First element: 0\n", "Second element: 1\n", "Last element: 9\n" ] }, { "ename": "IndexError", "evalue": "tuple index out of range", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mIndexError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m/tmp/ipykernel_411/3903788463.py\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'Second element:'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0ma\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 6\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'Last element:'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0ma\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mlen\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0ma\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 7\u001b[0;31m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0ma\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mlen\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0ma\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# IndexError\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mIndexError\u001b[0m: tuple index out of range" ] } ], 
"source": [ "a = (*range(10),)\n", "print(a)\n", "print(\"Length:\", len(a))\n", "print(\"First element:\", a[0])\n", "print(\"Second element:\", a[1])\n", "print(\"Last element:\", a[len(a) - 1])\n", "print(a[len(a)]) # IndexError" ] }, { "cell_type": "markdown", "metadata": { "ExecuteTime": { "end_time": "2020-10-27T14:55:28.986812Z", "start_time": "2020-10-27T14:55:28.980088Z" }, "slideshow": { "slide_type": "fragment" } }, "source": [ "`a[i]` with `i >= len(a)` results in an `IndexError`. " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "A negative index represents a negative offset from an imaginary element one past the end of the sequence." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "$$\\begin{aligned} \\boldsymbol{a} &= (a_0, ... , a_{n-1})\\\\\n", "& = (a_{-n}, ..., a_{-1})\n", "\\end{aligned}$$" ] }, { "cell_type": "code", "execution_count": 149, "metadata": { "ExecuteTime": { "end_time": "2020-11-02T23:48:34.920475Z", "start_time": "2020-11-02T23:48:34.906520Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]\n", "Last element: 9\n", "Second last element: 8\n", "First element: 0\n" ] }, { "ename": "IndexError", "evalue": "list index out of range", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mIndexError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'Second last element:'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0ma\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m2\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5\u001b[0m 
\u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'First element:'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0ma\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0mlen\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0ma\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m----> 6\u001b[0;31m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0ma\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0mlen\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0ma\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# IndexError\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mIndexError\u001b[0m: list index out of range" ] } ], "source": [ "a = [*range(10)]\n", "print(a)\n", "print(\"Last element:\", a[-1])\n", "print(\"Second last element:\", a[-2])\n", "print(\"First element:\", a[-len(a)])\n", "print(a[-len(a) - 1]) # IndexError" ] }, { "cell_type": "markdown", "metadata": { "ExecuteTime": { "end_time": "2020-10-29T04:10:06.523676Z", "start_time": "2020-10-29T04:10:06.517287Z" }, "slideshow": { "slide_type": "fragment" } }, "source": [ "`a[i]` with `i < -len(a)` results in an `IndexError`. " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**How to select multiple items?**" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We can use [slicing](https://docs.python.org/3/reference/expressions.html#slicings) to select a range of items as follows:\n", "```Python\n", "a[start:stop]\n", "a[start:stop:step]\n", "```\n", "\n", "The selected items correspond to those indexed using `range`:\n", "\n", "```Python\n", "(a[i] for i in range(start, stop))\n", "(a[i] for i in range(start, stop, step))\n", "```" ] }, { "cell_type": "code", "execution_count": 73, "metadata": { "ExecuteTime": { "end_time": "2020-11-02T23:49:23.393787Z", "start_time": "2020-11-02T23:49:23.389376Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(1, 2, 3)\n", "(1, 3)\n" ] } ], "source": [ "a = (*range(10),)\n", "print(a[1:4])\n", "print(a[1:4:2])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Unlike `range`, the parameters for slicing take their default values if missing or equal to `None`:" ] }, { "cell_type": "code", "execution_count": 59, "metadata": { "ExecuteTime": { "end_time": "2020-11-02T23:49:36.191993Z", "start_time": "2020-11-02T23:49:36.188102Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0, 1, 2, 3]\n", "[1, 2, 3, 4, 5, 6, 7, 8, 9]\n", "[1, 2, 3]\n" ] } ], "source": [ "a = [*range(10)]\n", "print(a[:4]) # start defaults to 0\n", "print(a[1:]) # stop defaults to len(a)\n", "print(a[1:4:]) # step defaults to 1" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The parameters can also take negative values:" ] }, { "cell_type": "code", "execution_count": 61, "metadata": { "ExecuteTime": { "end_time": 
"2020-11-02T23:49:59.510025Z", "start_time": "2020-11-02T23:49:59.505499Z" }, "scrolled": true, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[9]\n", "[0, 1, 2, 3, 4, 5, 6, 7, 8]\n", "[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]\n" ] } ], "source": [ "print(a[-1:])\n", "print(a[:-1])\n", "print(a[::-1]) # What are the default values used here?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "A mixture of negative and positive values is also okay:" ] }, { "cell_type": "code", "execution_count": 155, "metadata": { "ExecuteTime": { "end_time": "2020-11-02T23:51:22.313831Z", "start_time": "2020-11-02T23:51:22.308366Z" }, "scrolled": true, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[]\n", "[1, 2, 3, 4, 5, 6, 7, 8]\n", "[]\n", "[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]\n" ] } ], "source": [ "print(a[-1:1]) # equal [a[-1], a[0]]?\n", "print(a[1:-1]) # equal []?\n", "print(a[1:-1:-1]) # equal [a[1], a[0]]?\n", "print(a[-100:100]) # result in IndexError like subscription?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**Exercise** (Challenge) Complete the following function to return a tuple `(start, stop, step)` such that `range(start, stop, step)` gives the non-negative indexes of the sequence of elements selected by `a[i:j:k]`.\n", "\n", "*Hint:* See [notes 3-5 in the Python documentation](https://docs.python.org/3/library/stdtypes.html#common-sequence-operations)." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": { "nbgrader": { "grade": false, "grade_id": "sss", "locked": false, "schema_version": 3, "solution": true, "task": false }, "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "def sss(a, i=None, j=None, k=None):\n", "    ### BEGIN SOLUTION\n", "    l = len(a)\n", "    step = 1 if k is None else k\n", "    # The clamping bounds depend on the sign of step (note 5 in the docs).\n", "    lo, hi = (0, l) if step > 0 else (-1, l - 1)\n", "    first, last = (lo, hi) if step > 0 else (hi, lo)  # defaults for None\n", "    start = first if i is None else min(i if i >= 0 else max(i + l, lo), hi)\n", "    stop = last if j is None else min(j if j >= 0 else max(j + l, lo), hi)\n", "    ### END SOLUTION\n", "    return start, stop, step\n", "\n", "\n", "a = [*range(10)]\n", "assert sss(a, -1, 1) == (9, 1, 1)\n", "assert sss(a, 1, -1) == (1, 9, 1)\n", "assert sss(a, 1, -1, -1) == (1, 9, -1)\n", "assert sss(a, -100, 100) == (0, 10, 1)\n", "assert sss(a, 0, 3) == (0, 3, 1)  # an explicit index 0 is not shifted by len(a)\n", "assert sss(a, None, None, -1) == (9, -1, -1)  # defaults swap for a negative step" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**Exercise** With slicing, we can now implement a practical sorting algorithm called [quicksort](https://en.wikipedia.org/wiki/Quicksort) to sort a sequence. 
Explain how the code works:" ] }, { "cell_type": "code", "execution_count": 101, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[84, 89, 28, 26, 86, 95, 73, 44, 98, 60]\n", "[26, 28, 44, 60, 73, 84, 86, 89, 95, 98]\n" ] } ], "source": [ "def quicksort(seq):\n", "    \"\"\"Return a sorted list of items from seq.\"\"\"\n", "    if len(seq) <= 1:\n", "        return list(seq)\n", "    i = random.randint(0, len(seq) - 1)\n", "    pivot, others = seq[i], [*seq[:i], *seq[i + 1 :]]\n", "    left = quicksort([x for x in others if x < pivot])\n", "    right = quicksort([x for x in others if x >= pivot])\n", "    return [*left, pivot, *right]\n", "\n", "\n", "seq = [random.randint(0, 99) for i in range(10)]\n", "print(seq, quicksort(seq), sep=\"\\n\")" ] }, { "cell_type": "markdown", "metadata": { "nbgrader": { "grade": true, "grade_id": "quick-sort", "locked": false, "points": 0, "schema_version": 3, "solution": true, "task": false }, "slideshow": { "slide_type": "-" } }, "source": [ "The above recursion creates a sorted list as `[*left, pivot, *right]` where\n", "- `pivot` is a randomly selected item in `seq`,\n", "- `left` is the sorted list of items smaller than `pivot`, and\n", "- `right` is the sorted list of items no smaller than `pivot`.\n", "\n", "The base case happens when `seq` contains at most one item, in which case `seq` is already sorted."
] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.12" }, "latex_envs": { "LaTeX_envs_menu_present": true, "autoclose": false, "autocomplete": true, "bibliofile": "biblio.bib", "cite_by": "apalike", "current_citInitial": 1, "eqLabelWithNumbers": true, "eqNumInitial": 1, "hotkeys": { "equation": "Ctrl-E", "itemize": "Ctrl-I" }, "labels_anchors": false, "latex_user_defs": false, "report_style_numbering": false, "user_envs_cfg": false }, "rise": { "enable_chalkboard": true, "scroll": true, "theme": "white" }, "toc": { "base_numbering": 1, "nav_menu": { "height": "195px", "width": "330px" }, "number_sections": true, "sideBar": true, "skip_h1_title": true, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": { "height": "454.418px", "left": "1533px", "top": "110.284px", "width": "260.994px" }, "toc_section_display": true, "toc_window_display": false }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 4 }